MetDecode: methylation-based deconvolution of cell-free DNA for noninvasive multi-cancer typing

Antoine Passemiers et al.

Presenter: Tony Liang


October 31, 2024

Background

Circulating cell-free DNA (Adalsteinsson et al. 2017)
  • Cell-free DNA (cfDNA) consists of DNA fragments released into the bloodstream

  • The fraction of cfDNA released from cancer or tumor cells is called circulating tumor DNA (ctDNA)

  • cfDNA carries genetic and epigenetic changes that can reveal the cells from which it originated

    • Can help identify different types of cancer

Detecting the origin of cfDNA

Current cfDNA screening tests can detect the presence of abnormal signals, but cannot tell the cancer type or the tumor's tissue of origin (TOO)

  • Computational methods use epigenetic markers such as methylation profiles to deduce the origin of cfDNA fragments
    • They “deconvolute” the plasma cfDNA composition
    • Approaches vary: probabilistic models, linear models, matrix factorization, etc.
    • Most require a reference atlas
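
To make the idea of reference-based deconvolution concrete, here is a minimal sketch of the linear-model flavor. It is not MetDecode's algorithm: it uses plain least squares followed by clipping and renormalization, and the atlas values, tissue count, and variable names (`atlas`, `true_props`, `mixture`) are all made up for illustration.

```python
import numpy as np

# Hypothetical reference atlas: rows are tissues, columns are marker regions.
# Values are invented methylation ratios in [0, 1].
atlas = np.array([
    [0.9, 0.1, 0.8, 0.2],  # tissue 0
    [0.2, 0.7, 0.1, 0.6],  # tissue 1
])

# Simulate a noise-free cfDNA profile as a mixture of the two tissues.
true_props = np.array([0.7, 0.3])
mixture = true_props @ atlas

# Solve atlas.T @ x ~= mixture in the least-squares sense.
est, *_ = np.linalg.lstsq(atlas.T, mixture, rcond=None)

# Crude post-processing so the estimate looks like proportions:
# clip negatives to zero and rescale to sum to one.
est = np.clip(est, 0.0, None)
est = est / est.sum()

print(est)
```

On this noise-free toy input the estimate matches `true_props`. Real data would be noisy and the atlas may be incomplete, which is the gap the following slides address.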

Limitations of existing methods

  • Cannot deconvolute multiple cancer tissues
  • Do not account for missing contributors caused by an incomplete atlas
  • Do not fully deconvolute all cfDNA components; they estimate cell proportions only

MetDecode

The main deconvolution algorithm

\[ f(A) = \sum_{i=1}^{n} \sum_{k=1}^{p} W_{ik} \, \Big| \underbrace{R_{ik}^{(\text{cfdna})}}_{(1)} - \underbrace{\sum_{j=1}^{m} A_{ij} B_{jk}}_{(2)} \Big| \]

  1. Observed cfDNA methylation ratios \(R^{(\text{cfdna})}\)
  2. Reconstructed matrix \(AB\), which approximates \((1)\)
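
The objective above can be evaluated in a few lines. The sketch below only computes \(f(A)\) for given inputs and does not perform the optimization. The function name `weighted_l1_loss` and the toy values are hypothetical. Treating \(B\) as reference methylation profiles and \(W\) as per-entry weights is an assumption; the slides only define \(R^{(\text{cfdna})}\) and the reconstruction.

```python
import numpy as np

def weighted_l1_loss(R_cfdna, A, B, W):
    """Weighted absolute reconstruction error f(A).

    R_cfdna: (n, p) observed methylation ratios, term (1)
    A:       (n, m) contribution matrix being optimized
    B:       (m, p) second factor (assumed here to hold reference profiles)
    W:       (n, p) non-negative weights (assumed per sample and marker)
    """
    reconstruction = A @ B  # term (2): sum_j A_ij * B_jk
    return float(np.sum(W * np.abs(R_cfdna - reconstruction)))

# Toy data: 1 sample, 2 contributors, 3 marker regions (invented values).
B = np.array([[0.9, 0.1, 0.5],
              [0.2, 0.8, 0.5]])
A_true = np.array([[0.75, 0.25]])
R_cfdna = A_true @ B
W = np.ones_like(R_cfdna)

loss_exact = weighted_l1_loss(R_cfdna, A_true, B, W)
loss_wrong = weighted_l1_loss(R_cfdna, np.array([[0.5, 0.5]]), B, W)
print(loss_exact, loss_wrong)
```

The loss is zero at the contributions that generated the data and grows as \(A\) moves away from them, which is what a minimizer of \(f\) exploits.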

The math behind how MetDecode models the “unknown” contributor

MetDecode accounts for unknown contributors in the cfDNA mixture by adding \(h\) extra rows to \(R^{(\text{atlas})}\):

\[ R_{hk}^{(\text{atlas})} = \begin{cases} R_k^{lb}, & e_k > 0 \\ R_k^{ub}, & \text{otherwise} \end{cases} \quad \text{where} \quad e_k = \operatorname{median}_i \Big( -R_{ik}^{(\text{cfdna})} + \sum_{j} \alpha_{ij} R_{jk}^{(\text{atlas})} \Big) \]
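
The rule above can be sketched as follows, assuming the bound vectors \(R^{lb}\) and \(R^{ub}\) are already available. The slides do not say how they are obtained, so `R_lb` and `R_ub` are hypothetical inputs here, as are the function name `unknown_row` and all toy values.

```python
import numpy as np

def unknown_row(R_cfdna, alpha, R_atlas, R_lb, R_ub):
    """Build one extra atlas row for an unknown contributor.

    R_cfdna: (n, p) observed cfDNA methylation ratios
    alpha:   (n, m) current contribution estimates
    R_atlas: (m, p) reference atlas
    R_lb, R_ub: (p,) lower and upper bounds per marker (assumed given)
    """
    # e_k = median over samples i of (reconstruction - observation)
    residual = alpha @ R_atlas - R_cfdna
    e = np.median(residual, axis=0)
    # Positive e_k means the atlas overshoots, so pick the lower bound.
    return np.where(e > 0, R_lb, R_ub)

# Toy data: 2 samples, 2 known contributors, 3 markers (invented values).
R_atlas = np.array([[0.8, 0.2, 0.5],
                    [0.1, 0.9, 0.5]])
alpha = np.array([[0.5, 0.5],
                  [0.6, 0.4]])
R_cfdna = np.array([[0.40, 0.60, 0.6],
                    [0.50, 0.55, 0.6]])
R_lb = np.array([0.05, 0.10, 0.15])
R_ub = np.array([0.95, 0.90, 0.85])

new_row = unknown_row(R_cfdna, alpha, R_atlas, R_lb, R_ub)
extended_atlas = np.vstack([R_atlas, new_row])  # h = 1 extra row
print(new_row)
```

In the toy input the reconstruction overshoots only at the first marker, so the new row takes the lower bound there and the upper bound elsewhere.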

Evaluation metrics

In-silico mixtures of tumor gDNA and healthy cfDNA (4 samples, 10 replicates, 12 different tumor fractions) are evaluated with:

  • Pearson correlation coefficient and mean squared error (MSE) to assess MetDecode's estimates
  • Accuracy to assess multiclass cancer TOO prediction, and Cohen’s kappa to correct for chance agreement in the multiclass setting
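
For reference, the four metrics can be written directly from their textbook definitions. The sketch below uses NumPy only; the helper names and the toy inputs are illustrative and are not data from the study.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def mse(x, y):
    """Mean squared error."""
    return float(np.mean((np.asarray(x) - np.asarray(y)) ** 2))

def accuracy(y_true, y_pred):
    """Fraction of exactly matching labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement, p_e the agreement expected by chance.
    Undefined when p_e == 1 (both label vectors constant and identical).
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    labels = np.unique(np.concatenate([y_true, y_pred]))
    p_o = np.mean(y_true == y_pred)
    p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in labels)
    return float((p_o - p_e) / (1 - p_e))

# Invented tumor fractions: expected vs. estimated.
expected = np.array([0.00, 0.01, 0.05, 0.10])
estimated = np.array([0.00, 0.02, 0.04, 0.12])
r = pearson(expected, estimated)
err = mse(expected, estimated)

# Invented TOO labels.
y_true = ["breast", "breast", "ovarian", "colorectal"]
y_pred = ["breast", "ovarian", "ovarian", "colorectal"]
acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(r, err, acc, kappa)
```

In the toy labels, accuracy is 0.75 while kappa is lower (7/11), because part of the agreement would be expected by chance given the label frequencies.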

Result 1

Results – What authors get back 1

abc

Results – What authors get back 2

abc

Results

Cancer type prediction compared across methods, based on the highest cancer contributor
  • MetDecode with 1 unknown contributor performs best according to Cohen’s kappa

  • All methods perform equally poorly (accuracy \(< 50\%\)) when predicting across all samples

  • Performance is closer across methods on the \(19\) samples with tumor fraction \(> 3\%\) (Adalsteinsson et al. 2017)

    • MetDecode predicts the correct TOO in \(16/19\) of these cancer cases (\(84.2\%\) accuracy)
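
Predicting the cancer type from the highest cancer contributor amounts to an argmax over the cancer rows of the estimated contributions. The sketch below shows that step together with the \(16/19\) arithmetic. The row names, the contribution values, and the function name `predict_cancer_type` are hypothetical.

```python
import numpy as np

def predict_cancer_type(contributions, row_names, cancer_names):
    """Return, per sample, the cancer entity with the largest contribution.

    contributions: (n_samples, n_contributors) estimated proportions
    row_names:     contributor names, one per column of `contributions`
    cancer_names:  subset of row_names that are cancer entities
    """
    cancer_idx = [i for i, name in enumerate(row_names) if name in cancer_names]
    best = np.argmax(contributions[:, cancer_idx], axis=1)
    return [row_names[cancer_idx[b]] for b in best]

# Invented contributors and estimates for two samples.
row_names = ["blood cells", "breast", "ovarian", "colorectal"]
cancer_names = {"breast", "ovarian", "colorectal"}
contributions = np.array([
    [0.90, 0.06, 0.03, 0.01],
    [0.85, 0.02, 0.03, 0.10],
])
predicted = predict_cancer_type(contributions, row_names, cancer_names)

# Accuracy reported on the slide: 16 correct out of 19 cases.
accuracy_pct = round(16 / 19 * 100, 1)
print(predicted, accuracy_pct)
```

Non-cancer contributors are excluded before the argmax, since they would otherwise dominate in samples with a low tumor fraction.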

Conclusion of the authors

abc

Summary

How could one utilize cfDNA?

cfDNA epigenetic signatures can be used to deduce TOO or cancer type

MetDecode is an algorithm that estimates tissue contributions and the cancer type in a cfDNA sample

  • It models unknown contributors that are not present in the reference atlas

  • It accounts for the coverage of each marker region to reduce the impact of potential noise sources

Limitations and Future Direction

  • Limited number of cfDNA samples per cancer type

    • 93 samples in total: 4 cervical, 13 ovarian, and the rest breast and colorectal
  • Another limit

  • Deconvoluting cfDNA and determining the TOO will help oncologists identify the tumor and direct treatment

    • Especially when invasive examinations and radiological investigations are not ideal

My comments on this study

abc

Thank you

Adalsteinsson, Viktor A, Gavin Ha, Samuel S Freeman, Atish D Choudhury, Daniel G Stover, Heather A Parsons, Gregory Gydush, et al. 2017. “Scalable Whole-Exome Sequencing of Cell-Free DNA Reveals High Concordance with Metastatic Tumors.” Nature Communications 8 (1): 1324.